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A discrete adjoint-based design methodology for unsteady turbulent flows on three- 
dimensional dynamic overset unstructured grids is formulated, implemented, and verifled. 
The methodology supports both compressible and incompressible flows and is amenable to 
massively parallel computing environments. The approach provides a general framework 
for performing highly efficient and discretely consistent sensitivity analysis for problems 
involving arbitrary combinations of overset unstructured grids which may be static, un¬ 
dergoing rigid or deforming motions, or any combination thereof. General parent-child 
motions are also accommodated, and the accuracy of the implementation is established 
using an independent verification based on a complex-variable approach. The method¬ 
ology is used to demonstrate aerodynamic optimizations of a wind turbine geometry, a 
biologically-inspired flapping wing, and a complex helicopter configuration subject to trim¬ 
ming constraints. The objective function for each problem is successfully reduced and all 
specified constraints are satisfied. 
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J number of cost function components 

K vHx X rrix linear elasticity coefficient matrix 

L Lagrangian function 

rrid size of vector D 

nif size of solution vector at fringe points 

ruh size of solution vector at hole points 

TUq size of solution vector Q 

nis size of solution vector at solve points 

rux size of vector X 

N number of time levels 

n time level 

n outward-pointing normal vector 

P rrih X niq pseudo-Laplacian matrix 

p pressure, also cost function exponent 

Q THq X 1 vector of volume-averaged conserved variables 

q TUq X 1 vector of conserved variables 

R TOs X 1 vector of spatial undivided residuals 

TZ rUx X rUx block-diagonal rotation matrix 

R 3x3 rotation matrix 

Rgcl residual of the static geometric conservation law (GCL) 

R-gcl TOs X 1 vector of Rgcl 

S Control volume surface area 

T 4x4 transform matrix 

t time 

u, V, w Cartesian components of velocity 

V TUq X TOg diagonal matrix of cell volumes 

V control volume 

W 3x1 face velocity vector 

X rUx X 1 vector of grid coordinates 

X 3x1 position vector 

X independent variable 

X, y, z Cartesian coordinate directions 

a interpolation coefficient 

/3 scaling parameter for incompressible continuity equation 

e perturbation 

9 angle of rotation, also blade pitch 

0c collective input 

0ic lateral cyclic input 

01 s longitudinal cyclic input 

A ruq X 1 flow-field adjoint variable 

Ag rUx X 1 grid adjoint variable 

p density 

T vHx X 1 translation vector 

T 3x1 translation vector 

V’ blade azimuth 

(jj cost function component weight 

Wi, UJ 2 frequencies of rotation in rad/s 

Subscripts / Superscripts 

c child motion 

/ fringe point 

h hole point 

i, j, k, TO, n indices 

in quantity at initial conditions 
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nh quantity at simply-connected neighbor 

p parent motion 

s solve point 

a:, y, z axis of rotation 

00 quantity at free-stream conditions 

* target quantity 

overbar volume-averaged or time-averaged quantity, also complement of a vector 
Symbols 

Diag diagonal matrix operator 

o Hadamard vector multiplication operator 

© extension of o to a vector-matrix product 


I. Introduction 

As access to powerful high-performance computing resources has become more widespread in recent years, 
the use of high-fidelity physics-based simulation tools for analysis of complex aerodynamic fiows has become 
increasingly routine. The availability and affordability of high-fidelity analysis tools has in turn stimulated an 
enormous body of research aimed at applying such tools to formal design optimization of complex aerospace 
configurations. A survey of the relevant literature shows that optimization methods based on the Euler 
and Reynolds-averaged Navier-Stokes equations have indeed gained a strong foothold in the design cycle 
for problems governed by steady fiows.Conversely, formal optimization methods for problems involving 
unsteady flow are also under development,^^® but in general are not as mature at the present time. This 
lag can be attributed at least in part to the increased computational cost typically associated with unsteady 
simulations. 

For gradient-based optimization of problems involving many design variables, the ability to generate 
sensitivity information at a relatively low cost is critical. Unlike forward differentiation techniques such as 
finite differencing,^® direct differentiation,^^ and complex-variable methods,the adjoint approach performs 
sensitivity analysis at a cost comparable to that of a flow solution and independent of the number of design 
variables.Efficient evaluation of sensitivities of an output with respect to all input parameters has led 
to numerous applications of adjoint-based methods in various areas of research and engineering. Some 
recent adjoint-based developments include a mathematically-rigorous approach to error estimation and mesh 
adaptation,simultaneous design of shape and active flow control parameters for a high-lift configuration,® 
efficient methods for uncertainty quantification,®® sonic boom optimization,®® laminar flow control,®®” and 
many others. 

Adjoint methods can be further classified into continuous and discrete variants, depending on the order 
in which the differentiation and discretization of the governing equations is performed. A discrete adjoint 
approach to sensitivity analysis is taken here. The methodology has been widely used for a broad class 
of optimization problems involving both steady and unsteady flows.®’®’®8-24 gj ^j^g advantages of 

the discrete adjoint approach is that the sensitivities computed by this method can be verified to machine 
precision by comparison with complex variable sensitivities.®® The approach requires a complete linearization 
of the discrete governing equations with respect to both the flow-field variables and mesh coordinates. Strictly 
speaking, the adjoint approach for unsteady fiows requires the evaluation of these linearizations at each 
physical time step. Therefore, the predominant challenge in extending a steady-state implementation to 
the unsteady regime is the development of an efficient infrastructure to store and access the entire forward 
solution as needed. 

The analysis of vehicles experiencing large relative motion of vehicle components is often accomplished 
using overset discretizations. Design optimization for unsteady fiows using such discretizations serves as the 
primary motivation for the current work. An implementation of the discrete adjoint approach for optimization 
of general three-dimensional unsteady turbulent fiows on single-block unstructured grids is described in Refs. 
3 and 5. Others have previously demonstrated adjoint-based capabilities for overset mesh discretizations; 
however, such works have been restricted to steady fiows. The methodology described here is intended 
for aerodynamic optimization of configurations characterized by large dynamic grid motions. 
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The primary contributions of this paper are the development, implementation, verification, and demon¬ 
stration of an adjoint-based methodology for optimization and design of the most general unsteady aero¬ 
dynamic flows. In the case of rotary wing flows, an optimization reported here involves a full helicopter 
configuration subject to trimming constraints and completes the series of studies addressing models of pro¬ 
gressively higher fidelity. The previously considered models include actuator disk approaches,noninertial 
formulations,and dynamic grid formulations involving isolated rotors.^ The generality of dynamic over¬ 
set unstructured grid methods makes this methodology applicable to the most general flows occurring in a 
variety of practical computational fluid dynamics applications, e.g., store/stage separation, turbomachinery, 
wind turbine systems, rotary wing systems, biologically-inspired devices, and many others. Several diverse 
large-scale design applications are demonstrated in this paper. 

The material is presented in the following order. The governing equations and some fundamental concepts 
of overset mesh systems are presented first. The approach taken to solve the flow equations is reviewed, 
followed by a derivation of the accompanying discrete adjoint equations. Details of the solution strategy 
are covered and the accuracy of the implementation for a very general dynamic motion case is verified 
using an independent approach based on complex variables. Finally, successful demonstrations of the design 
methodology are shown for a wind turbine geometry, a biologically-inspired flapping wing, and a realistic 
helicopter configuration. The appendix contains derivations for high-order temporal schemes. 

II. Governing Equations 

In this paper, the unsteady turbulent flow equations are used in both compressible and incompressible 
formulations. The primary distinction between these formulations is that the incompressible continuity 
equation does not have a time derivative term; all other (compressible and incompressible) equations do 
have time derivatives. For a simultaneous description of the unsteady compressible and incompressible 
Navier-Stokes equations, it is convenient to introduce an indicator of time derivative, C, and a Hadamard 
vector multiplication operator.The vector C is a logical vector composed of zeros and ones and has the 
same dimension as the residual vector. Ones correspond to equations with time derivatives, while zeros 
correspond to equations with no time derivatives. The logical complement to C, C, is a vector of the same 
dimension in which zeros are replaced by ones and vice-versa. The Hadamard operator is denoted as o and 
acts on two vectors of the same dimension, which are multiplied in an element-by-element fashion. The result 
of the Hadamard multiplication is a vector of the same dimension. The simultaneous description of the flow 
equations involves the Hadamard multiplication of the vector C with the vector of time derivatives. The 
resulting equations can be written in the following form for both moving and stationary control volumes; 

C o ^ / qdF + / {Yinv - F„iac) • MS = 0, (1) 

(Jt Jv JdV 

where V is the control volume bounded by the surface dV. The vector q represents the conserved variables 
for mass, momentum, and energy, and the vectors Fj„„ and F„jsc denote the inviscid and viscous fluxes, 
respectively. 

For a moving control volume, the viscous flux is unchanged while the inviscid flux vector accounts for the 
difference in the fluxes due to the movement of control volume faces. Given an inviscid flux vector F on a 
static grid, the corresponding fluxFj„„ on a moving grid can be defined as Fj„„ = F—(Coq-|-C)(W-h), where 
W is a local face velocity and n is an outward-pointing unit face normal. In other words, Fi„„ = F —q(W-h) 
for a conservation equation with a time derivative and Finy = F — (W • h) for an equation without a time 
derivative. 

By defining a volume-averaged quantity q within each control volume, 

q=:^y^qdF, (2) 

the conservation equations given by Eq. 1 take the form 

C o / (Finv - Fyisc) ■ MS = 0. (3) 

Jqv 

Here the conserved variables and inviscid flux vectors for compressible flows are defined as q = [p, pu, pv, pw, E]'^ 
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and 
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and the perfect gas equation of state is assumed. The corresponding vectors for incompressible flows are 
q = [p, u, V, w]'^ and 
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where /3 is a scaling parameter analogous to the artificial compressibility parameter. Recall, however, that 
the incompressible continuity equation does not have a time derivative. The viscous flux vector is not 
explicitly shown here. For turbulent flows, the equations are closed with an appropriate turbulence model 
for the eddy viscosity. 

The high-order (up to third-order) backward difference (BDF) discretizations for the time derivative of 
a function s are defined as 

^ ^ (as" -I- 6s"“^ -t- cs"“^ -I- ds"“^), (6) 

where n is a time level, and the coefficients are given in Table 1. The number after the BDF abbreviation 
indicates the order of the scheme. The coefficients listed for the BDF2opt scheme are a linear combination of 
the BDF2 and BDF3 coefficients taken from Refs. 33 and 34. The resulting scheme is second-order accurate 
but has a leading truncation error term less than that of the BDF2 scheme. 

Using a BDFl scheme, the discrete form of the flow equations at time level n is given as 


C 


qnyn _ ^n-lyn-1 

At 


-k R" = 0, 


(7) 


where U" and q" are a control volume and the corresponding solution vector at time level n and R" is 
a vector of spatial undivided residuals approximating the flux term in Eq. 3. The first-order temporal 
scheme is chosen for the sake of simplicity; higher-order BDF schemes are used in practical computations 
and the demonstrations below. The Arbitrary Lagrangian-Eulerian (ALE)^® node-centered finite-volume 
discretization of Eq. 3 used in the current work and described in Ref. 36 employs the following discrete 
formulation; 

c o V " + R" + R5cl(C o q"-i + /3C) = 0 . (8) 


Here, 


Rgcl — 


W" • hdS, 


(9) 


where W" is a vector of local face velocities at time level n. Note that substituting a spatially and temporally 
constant state vector, q, into Eq. 7 results in a geometric conservation law (GCL)^'^ 


yn _ yn-l 

At 


- Rgcl — 0 


( 10 ) 


for an equation with a time derivative and 


-pRlcL = 0 ( 11 ) 

for the incompressible continuity equation. Eq. 8 is obtained by subtracting the GCL residual, multiplied 
by q"“^ for unsteady equations, from Eq. 7. 
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III. Overset Grids 


An overset grid formulation is characterized by the presence of two or more overlapping component grids. 
Each grid point and its corresponding control volume may be classified as one of four types based on the 
nature of the equation to be solved for that control volume. “Solve” points are points at which the discretized 
flow equations given by Eq. 8 are defined. “Eringe” points are points in overlap regions where interpolated 
data is specified in lieu of boundary conditions. The equations defined at fringe points are interpolation 
equations such that the solution at a fringe point, q/, is defined as a linear combination of solution values 
at solve points, q*: 

q/-^a'=q,^ = 0, = (12) 

Typically, the fringe point and the solve points appearing in Eq. 12 belong to different overlapping component 
grids. “Hole” points refer to points outside the computational domain, e.g., within the boundaries of a wing. 
In the current approach, the solution at hole points, q/i, is set to the average of the solution values at its 
simply-connected neighbors, q^^. This averaging procedure is equivalent to a discrete pseudo-Laplacian, 
which is an elliptic operator; 

E(q'‘-C6) = 0, (13) 

where the hole point neighbors are identified by j. Einally, “orphan” points refer to grid points located 
within the computational domain for which neither the flow equations are imposed, nor can suitable points 
be found from which to interpolate solution information. In the current effort, the same pseudo-Laplacian 
procedure is defined for hole and orphan points, so that orphan points are not explicitly considered as a 
separate entity in the formulation to follow. 

Eor dynamic grid motions, the character of each grid point may change as a function of time. It is 
preferable to have grid topologies such that the residuals of the governing equations at solve and fringe 
points do not depend on solution values at hole points. At a minimum, hole-point solutions should not 
contribute to residuals at solve and fringe points within the same time level. In practice, it can be difficult 
to prevent solutions at hole points from contributing to residuals at solve points through the time derivative; 
however, maximizing the extent of fringe regions and reducing the time step can help to alleviate this 
difficulty. 

The domain-connectivity information required by the overset implementation is established using the 
software libraries described in Ref. 38. This methodology has been used extensively with the flow solver for 
performing analysis of multibody problems undergoing large relative motions.^°’^®’^®“^® Given the topology 
of each component grid, each grid point in the composite grid is determined to be a solve, fringe, hole or 
orphan point. This procedure is performed dynamically during the solution process as required by the grid 
motion. The mesh elements containing fringe points are established and the weighting coefficients required 
to interpolate data at such points are evaluated. Eor cases in which the grid motion is periodic, the user may 
choose to store the domain-connectivity information during the first cycle of motion for use in subsequent 
cycles. Once the interpolation coefficients are known, the complementary library described in Ref. 46 is used 
to determine the current solution at fringe points. The solution at hole and orphan points is determined based 
on user-supplied subroutines specifying the desired treatment at such locations. In the current approach, 
the pseudo-Laplacian given by Eq. 13 is used. 

IV. Flow Solver 

References 23,36, and 47-49 describe the flow solver used in the current work. The code can be used 
to perform aerodynamic simulations across the speed range and an extensive list of options and solution 
algorithms is available for spatial and temporal discretizations on general static or dynamic mixed-element 
unstructured meshes which may or may not contain overset grid topologies. 

In the current study, the spatial discretization uses a finite-volume approach in which the dependent 
variables are stored at the vertices of tetrahedral meshes. Inviscid fluxes at cell interfaces are computed 
using the upwind scheme of Roe,®° and viscous fluxes are formed using an approach equivalent to a finite- 
element Galerkin procedure. The incompressible implementation is based on Refs. 49 and 51. Eor dynamic 
mesh cases, the mesh velocity terms are evaluated using backward differences consistent with the discrete 
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time derivative; this makes the spatial and GCL residuals dependent on grids at previous time levels. The 
eddy viscosity is modeled using the one-equation approach of Spalart and Allmaras.^^ The turbulence 
model is integrated all the way to the wall without the use of wall functions and is weakly coupled, i.e., 
solved separately from the mean flow equations at each time step. Scalability to thousands of processors 
is achieved through parallel domain decomposition, pre-processing, and solver mechanics. Post-processing 
operations such as the generation of isosurface and computational schlieren animations are also performed 
in parallel, avoiding the need for a single image of the mesh or solution at any time and ultimately yielding a 
highly efficient end-to-end parallel simulation paradigm. To date, this approach has been used to carry out 
computations on meshes containing as many as two billion points and twelve billion tetrahedral elements. 

To collectively describe equations and solutions defined at solve, fringe, and hole points, it is convenient to 
introduce corresponding projectors I”, ly, and IjJ at time level n. These operators are rectangular matrices 
of respective dimensions m® x nrif x mg, and mh x mg, and whose rows contain a single unity entry 
complemented by zeros. The values mg, mf, and mh are the solution dimensions at all solve, fringe, and hole 
points, respectively, and mg = mg+ mf + mh is the solution dimension at all grid points. The projectors are 
used to extract solutions at grid points of a specific type; Q” = I^Q”, = I^Q", and QjJ = IjJQ", where 

Q” is the vector of solution values at all grid points and Q”, Qj, and QjJ are the vectors of solution values 
at solve, fringe, and hole points, respectively. Finally, note that the projector operators can vary in time. 

The discrete form of the flow equations with a BDFl scheme for the time derivative at time level n can 
be written as 

C> V> ' + R" + [(i:Q”-') o C: + /3C^] o = 0. (14) 

In Eq. 14 and all discussions to follow, R" and Rccl rus x 1 vectors that include residuals at solve 
points, V” is an mg x 1 vector of control volumes for all equations at time level n, V™ = I"V” is a subset 
of V” corresponding to solve points, C" is an rUg x 1 vector-indicator of a time derivative restricted to solve 
points at time level n, and C" is the complement of C". Note that a solve point at time level n may or may 
not be a solve point at time level n — 1. 

The equations at fringe points are defined as 


A"Q" = 0, (15) 

where A" is the mf x mg matrix defining the interpolation of solution data from overset grid solutions at 
time level n as introduced in Eq. 12. The equations at hole points are defined as 

P”Q” = 0, (16) 


where P” is the mh x m, matrix of the pseudo-Laplacian given by Eq. 13. 

The Jacobian of the implicit Eqs. 14, 15, and 16 at time level n is a 3 x 3 block matrix of the form 


^Diag(C>V^) 

A™ 

P” 




(17) 


where Diag(Cg o V”) is a diagonal mg x mg matrix with the vector C” o V” on the main diagonal; A’J is an 
mf X mf diagonal matrix describing interpolation at fringe points; A" and AjJ are matrices with respective 
dimensions mf x mg and m/ x mh describing interpolation from solve and hole points; and P", FJ, and PJJ 
are matrices with respective dimensions mh x mg, mh xmf, and mh x mh describing contributions of solve, 
fringe, and hole points to the pseudo-Laplacian defined at hole points. Note that if the solution at hole points 
does not contribute to residuals at solve and fringe points within the same time level, then 9R"/9QjJ = 0, 
AjJ = 0, and the equations at hole points decouple from the equations at solve and fringe points. 


V. Grid Equations 

The general grid equations can be defined in the form 

G”(X,D)=0, (18) 
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where the x 1 vector X represents the coordinates of the composite overset mesh (meshes at several time 
levels may be involved), D is the vector of design variables, and n denotes the time level and indicates that 
the grid operator may vary in time. Moreover, different grid operators G" may be specified for different 
component grids. The specific formulations for different grid motions are introduced next. 

V.A. Grids Undergoing Rigid Motion 

For problems in which rigid mesh motion is required, the motion is generated by a 4 x 4 transform matrix, T, 
as outlined in Ref. 36. This transform matrix enables general translations and rotations of a grid according 
to the relation 

X = Tx°, (19) 

which moves a point from an initial position x° = (a:°, tp, to its new position x = (a:, y, z)^: 



In an expanded form, x = RsP + t. Here, the 3x3 matrix R defines a general rotation and the vector r 
specifies a translation. The matrix T is generally time dependent. One useful feature of this approach is that 
multiple transformations telescope via matrix multiplication. This formulation is particularly attractive for 
composite parent-child body motion, in which the motion of one body is often specified relative to another. 
The reader is referred to the discussion in Ref. 36 for more details. For a rigid-motion formulation, the grid 
operator at time level n is defined as 

G” (X", X°, D) = 7^"X° -h r” - X”, (21) 

or in abbreviated notation, 

G”(X”,X°,D) = T”X°-X". (22) 

Here, X° and X” are the grid vectors at the initial and n-th time levels, respectively; TZ^ is an x 
block-diagonal matrix with 3x3 blocks representing rotation and rUx being the size of vector X”; and r” 
is an nix-size translation vector. The matrix 7?." and vector r” may explicitly depend on D. 

V.B. Deforming Grids 

The simplest example of a deforming grid simulation is a static grid undergoing deformations as a result 
of a shape optimization process. In this case, the grid is not time-dependent and is modeled as an elastic 
medium that obeys the elasticity relations of solid mechanics. An auxiliary system of linear partial differential 
equations (PDFs) is solved to determine the mesh coordinates after each shape update. Discretization of 
these PDFs yields a system of equations 

K (X - X) = Xbound - Abound, (23) 

where K represents the elasticity coefficient matrix, X is the vector of grid coordinates being solved for, 
X is the vector of coordinates of a reference grid, and 'Kbound and Abound are the vectors of corresponding 
boundary coordinates, complemented by zeros for all interior coordinates. The coefficients of the matrix K 
depend on X. The material properties of the system given by Fq. 23 are chosen based on either the local 
cell geometry or proximity to the surface and are invariant with respect to coordinate transformations. The 
system is solved using a preconditioned generalized minimal residual algorithm. For further details on the 
approach, see Refs. 19,36, and 54. 

For static grid deformation, the only grid operator used at all times is 

G(X, D) = -K (X - X) + Xb„„„rf - %f,ound, (24) 

where 'K.^ound may explicitly depend on D, X is an independent grid obtained either from a grid generator or 
from the previous optimization iteration, and 'Abound is the vector of corresponding boundary coordinates. 
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When time-dependent deforming grids are required, the rigid motion as described in the previous section 
is not valid. For small relative grid deformations, the linear elasticity equations given by Eq. 23 are solved 
at each time level with the matrix K = K° computed at the initial time level and fixed throughout the time 
evolution; includes the description of the current body positions. The grid operator at time level n 

is defined as 

G"(X", D) = -RO (X" - X) + - %ounci. (25) 

V.C. Parent-Child Motions 

Large relative motions are described through parent-child relations, in which the collective motion of a child 
body is described as the product TyT^, where Tp is the collective parent transform matrix (which itself 
can be a chain of parent-child products) and Tc is the transform matrix describing the motion of the child 
with respect to the parent. In the current implementation, there is a one-to-one correspondence between 
moving bodies and component grids. Additional static grids may be associated with the non-inertial frame. 
Thus, a transform matrix describes not only the body motion, but may also describe the transformation of 
the corresponding grid. In general, a parent-child chain of motions can include an arbitrary combination of 
rigidly moving and deforming overset grids. If a component grid, X”, is designated as rigid, then all nodes 
of this grid undergo the same motion described as 

G” (X”, X", D) = -X" + TpTeXO. (26) 

If a component grid is designated as deforming, then the initial grid, X°, is either given, 

G°(X°,D) =-X°-hX, (27) 

or computed from the elasticity equations, Eq. 25. The corresponding body surface undergoes the TpTc 
motion, the external boundary and the initial (reference) grid undergo the Tp motion, and the grid at time 
level n, X”, satisfies the elasticity relations 

G”(X”, X°, D) = -K" (X" - TpX°) -h X^ound - T^p'^lound- (28) 

Here, the matrix K" is computed using the moved initial grid TpX°. Note that because of invariance of the 
material properties of the elasticity system, the following identity holds: 

K"Tp = TpK°. (29) 

In the current implementation, if any component grid is designated as deforming, then the entire composite 
grid is designated as deforming, and all component grids are treated as deforming, including those component 
grids that are in fact rigid. In this scenario, the external boundaries and the reference grid of a rigid 
component grid are moved with the collective motion of the corresponding body, TpT^, the boundary 
variations in Eq. 28 become zero, and the obtained grid, X", is equivalent to the rigidly moving one, Eq. 26. 
If all component grids are labeled as either rigid or static, then the composite grid is designated rigid, and 
all grid points are moved according to Eq. 26. 

VI. Cost Functions and Design Variables 

The steady-state adjoint implementation described in Refs. 18-24 permits multiple objective functions 
and explicit constraints of the following form, each containing a summation of individual components: 

U = (30) 

Here, uij represents a user-defined weighting factor, Cj is an aerodynamic coefficient such as the total 
drag or the pressure or viscous contributions to such quantities, the superscript * indicates a user-defined 
target value of Cj, and pj is a user-defined exponent. Targets are chosen to encourage beneficial changes 
in the design parameters and are typically far enough from the baseline values to avoid limiting potential 
improvements. The exponent values are chosen so that /* is a convex functional, which is important for 
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convergence of gradient-based optimization. The user may specify computational boundaries to which each 
component function applies. The index i indicates a possibility of introducing several different cost functions 
or constraints, which may be useful if the user desires separate sensitivities, for example, for lift, drag, 
pitching moment, etc. The implementation also supports multipoint optimization.^*^ 

For the unsteady formulation, similar general cost functions /” are defined at each time level n. The 
accumulated cost function /* can be defined as a discrete sum over a certain time interval 

Ni 

fi=i^ /r, ( 31 ) 

n=Nl 

where time levels N} and Nf correspond to t\ and tf, respectively. The corresponding time integral is 
approximated as /iAt. The current study also introduces an additional cost function of the following form, 
which is based on the time-averaged value of an output; 

^_ f 



The user supplies time intervals over which the cost functions are to be used. 

There are three classes of design variables available in the current implementation. The first is composed 
of global parameters unrelated to the computational grid. These variables include parameters such as 
the free-stream Mach number and angle of attack. Such variables are particularly useful in verifying the 
implementation of the flow-field adjoint equations. 

The second class of design variables provides general shape control of the configuration. The implemen¬ 
tation allows the user to employ a geometric parameterization scheme of choice, provided the associated 
surface grid linearizations are available. For the examples in the current study, the grid parameterization 
approach described in Ref. 55 is used. This approach can be used to define general shape parameterizations 
of existing grids using a set of aircraft-centric design variables such as camber, thickness, shear, twist, and 
planform parameters at various locations on the geometry. The user also has the freedom to associate design 
variables to define more general parameters. In the event that multiple bodies of the same shape are to be 
designed — such as a set of rotor blades — the implementation allows for a single set of design variables to 
be used to simultaneously define such bodies. In this fashion, the shape of each body is constrained to be 
identical throughout the course of the design. 

Finally, the third class of design variables governs any kinematics that may be present. The user may 
invoke simple translation and rotation functions native to the solver; in this case, basic parameters such 
as frequencies, amplitudes, directional vectors, and centers of rotation are available as design variables. 
Alternatively, more complicated kinematics and associated design variables may be supplied through a 
user-defined subroutine satisfying a standard interface. This interface is wrapped with a complex-variable 
perturbation scheme*^^ to numerically evaluate the Jacobian of the specified kinematic motion which is 
required by the adjoint formulation to follow. 

VII. Adjoint Equations 

The goal of the design optimization problem for unsteady flows is to choose the design parameters D 
to minimize an objective function, fobj = f^t, where / is posed by Eqs. 31 or 32 and the subscript i is 
omitted. For the sake of clarity, the formulation to be presented here is based on a BDFl scheme for the 
time derivative as introduced in Eq. 14. The derivation for higher order BDF schemes is similar and is 
presented in the appendix. Following the methodology described in Refs. 5 and 56, a Lagrangian function 
is defined as 

T (D, Q, X, A, Ag) = /Af + ([AO] ^ G° + [AO] ^ At 

+ E { [A”] ^ G- + [a^] ^ [A"Q"] + [A^]^ [P"Q"] (33) 

+ [A?]^ [c^ O V? O + Rn + ((l^Qn-i) o + /3C?) O } At 
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Here, A", AJ, and A^ are m* x 1, m/ x 1, nih x 1, and rrix x 1 vectors of Lagrange multipliers associated 
with the solve, fringe, hole, and grid equations, respectively; [A”]^ = |[A”]^, j^A^j , [A^]^|; A" = I"A", 


Aj = IjA", and AJJ = IJJA"; and R®” = 0 represents the initial conditions. A typical form of the initial 
conditions is R®" = V° o — Q°), where Qoo is the free-stream solution; other forms, such as a steady- 
state initial solution, are also possible. 

The Lagrangian given by Eq. 33 is differentiated with respect to D, assuming that V®® depends on X®®; 
G®® depends on X®®, X°, and D; R®® depends on Q®®, X®®, X®®“^, and D; 'R,qcl depends on X®®, X®®"^, and 
D; A®® depends on X®®; G° depends on X° and D; R®®® depends on Q°, X°, and D; and P®®, C", C”, I", 
Ip and IJJ are independent of grid coordinates, solutions, and design parameters. 

Regrouping terms to collect the coefficients of 9Q®®/9D and equating those coefficients to zero yields the 
adjoint equations; 


S: 


F ; 


H: 


ijC” o V> A- + [fg] ^ A? + [A-]^ A-f + [P”]^ Al = 


^ ° (- it+ RS^i) o A-+1] , 

A-+ [Alf A]+ [PlfAl = 

^ [Cr' o (- ^+ Rl+\) o A?+i] , 

[ ^ AO = - ^ - [II] ^ [C^ o (- ^VI + R^^ J o Ai] 


for I <n< N] 
for n = 0, 


(34) 


where A^+^ = 0. The preceding letters indicate the type of points at which the equations are defined; S', 
T, and H correspond to solve, fringe, and hole points, respectively. Collecting the coefficients of 9X'®/i9D 
and equating those coefficients to zero in a similar fashion yields the grid adjoint equations; 


-[pf Ap + 

+ + ((ir*Q"‘-') o c;+* + / 3 c;+‘) 0 ® a;+‘ + [ J,] ®, /«• 1 < n < jv; 



(35) 

Here, S//9X®® is a 1 x rria, row vector, 9G®®/5X®® is an rrix x nix matrix, 9V"/9X®®, 9R'®/9X®®®, and 
dRQQi^/d'X."^ are nis x nix matrices, 9(A'®Q®®)/9X®® is an m/ x nix matrix, and SR®®®/9X0 is an m, x nix 
matrix. The operation 0 is an extension of the Hadamard multiplication to a product between an mg x 1 
vector and an x m matrix, where the second matrix dimension, m, is arbitrary. The operation indicates 
that the vector multiplies the columns of the matrix in an element-by-element fashion resulting in a new 
nis X m matrix. 

When considering the linearization of A®®, the domain-connectivity information is assumed to be fixed. 
That is, the weighting coefficients represented by this matrix are considered functions of the mesh coordinates; 
however, the interpolating elements are considered constant so that the hole-cutting and domain-connectivity 
algorithms need not be linearized. 

With Lagrangian multipliers satisfying equations Eqs. 34 and 35, the sensitivity derivatives are calculated 
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as follows; 


§5 - + S [^ff] [ w + ((I”Q" ^) o C” + /3C^) 0 Af 

(36) 

+ ([A°]"^+[A°]"[^])Af, 

where dL/dT) and df/dT> are 1 x row vectors, OG^/dD is an nix x md matrix, dR^/dT) and 9 Rq(^^/i 9D 
are nig x matrices, and 5R™/9D is an m, x nid matrix. 

To facilitate the solution of Eqs. 34 and 35, the values of X", 9X"/9f, and Q” are stored to disk at the 
conclusion of each physical time step of the flow solution using a strategy designed to minimize file system 
overhead. The approach is based on a massively parallel paradigm in which each processor writes to its own 
unformatted direct-access file at each time step. The data writes are buffered using an asynchronous paradigm 
so that execution of floating point operations for the subsequent time step may proceed simultaneously. This 
approach is described and evaluated in Ref. 3 and has been found to scale well to several thousand processors 
using a parallel file system. Rather than recompute the domain-connectivity information during the adjoint 
solution procedure, a similar I/O paradigm has been implemented to efficiently store this information to 
disk, although the size of this data is typically an order of magnitude less than the flow-field data. During 
the solution of Eqs. 34 and 35, data is loaded from disk using a similar paradigm but in reverse, such that 
data required for the solution at time level n — 1 is pre-loaded during the computations for time level n. 

VIII. Iterative Solution of Equations at Each Time Level 

When solving the flow equations, the value of is taken to be an initial approximation for Q". The 
solution of Eqs. 14, 15, and 16 at time level n is obtained through the following iterations, which exploit the 
form of the Jacobian matrix given by Eq. 17; 

F ; AjAQj’ = — -|- A^Q^’ -1- Aj(Q^’ j , 

= Q/’" + AQ;/’’", (37) 

5 ; [^Diag(V^) + ijDiag(C^ o V^) + AQ?>- = 

_ [c^ O , yn + + ((I-Q-I) o + /3C^) O Rg^ J , ^33^ 

Qn,m+1 _ Qn,m _j_ 

P»iQr = -[p:Q;-«+p?Qr"'+n-Qr]. 

= Q”''" + aq;;’'". 

Here, the second superscript m is the iteration count, R"-'" is the spatial non-linear residual computed 
for the most recent solution that involves At is a pseudo-time step, and 9R"’'"/9Q” is the Jacobian 

of a first-order spatial discretization. 

At each iteration, Eq. 37 is solved exactly because A^ is a diagonal matrix, and the fringe solutions 
are updated first. An approximate solution of the linear system of equations (Eq. 38) is obtained through 
several iterations of a multicolor Gauss-Seidel point-iterative scheme, followed by a solution update for 
Q"’™+i. Einally, Eq. 39 is relaxed and solutions at hole points are updated. The convergence rate of the 
solution at hole points is typically the slowest; relaxation of the pseudo-Laplacian operator is known for poor 
convergence behavior. If the solution at hole points is decoupled, then its value may be updated only once 
after the solution at flow and fringe points has been converged. 

The adjoint equations are solved backwards in time. The solution procedure outlined here is based on 
the single-grid implementation which has been previously verified for turbulent flows on three-dimensional 
unstructured grids undergoing general dynamic motions.® The iterative solution of the adjoint equations 
given by Eq. 34 at time level n is performed in precisely the reverse order as the iterations given by Eqs. 
37-39; 




( 40 ) 


H : [Vlf AA"’- = _ ^ _ Ijj [in+i] ^ [c?+i o (- ^ V?+i + Kl+\) ° '] 

- [Plf - [flj] A-.- - [A-]^ A™’-, 

^n,m+l ^ ^n,m ^ 


5 : [^Diag(V?) + A^Diag(C^ o V”) + AA?>- = 

- [z^] ^ - I” (-i* o A^+i] 

.^C> V> A-.- - [fg] ^ A-.- - [A?]^ A”-™ - [P-]^ Al^ 


(41) 


^n.m+l ^ j^n 


F : [a;;] ^ AA”-™ = - [^] ^ - I« [l^+i]^ [C^i o (-A,V?+i + R^+i^) o A^+i] 

[AfAr.[pfAr-, 

A?'"+^=Ar+AAr- 

Solutions for the grid adjoint equations are obtained through relaxation of Eq. 35. 


(42) 


IX. Verification of Adjoint Implementation 


To verify the accuracy of the implementation, comparisons are made with results generated through an 
independent approach based on the use of complex variables. This approach was originally suggested in Refs. 
12 and 57, and was first applied to a Navier-Stokes solver in Ref. 58. Using this formulation, an expression 
for the derivative of a real-valued function f{x) may be found by expanding the function in a complex-valued 
Taylor series, using an imaginary perturbation ie: 


dx e ^ ' 


(43) 


The primary advantage of this method is that true second-order accuracy may be obtained by selecting 
step sizes without concern for subtractive cancellation errors typically present in real-valued Frechet deriva¬ 
tives. Through the use of an automated scripting procedure outlined in Ref. 59, this capability can be 
immediately recovered at any time for the baseline flow solver. For computations using this method, the 
imaginary step size has been chosen to be 10“®°, which highlights the robustness of the complex-variable 
approach. For each verification test, all equation sets are converged to machine precision for both the 
complex-variable and adjoint approaches. Since the package described in Ref. 46 cannot directly accom¬ 
modate complex-valued grids and solutions, the integer-valued donor and receptor information is instead 
transferred to the solver, which performs the requisite complex-valued donor weight computations and solu¬ 
tion interpolations. This procedure has been verified to produce identical real components as compared to 
the routines internal to the package of Ref. 46. 

The test case used to verify the accuracy of the implementation is based on the rotorcraft configuration 
shown in Fig. 1. The conventional rotorcraft definition for the azimuth angle 'tp is also shown in the figure. 
The fuselage is described by a component mesh consisting of 88,001 nodes and 505,437 tetrahedral elements. 
Each of the four rotor blades is modeled using a component grid containing 103,296 nodes and 601,459 
tetrahedral elements. The entire configuration is combined with a background grid consisting of 50,156 
nodes and 285,587 tetrahedral elements to yield a composite mesh system with 551,341 nodes and 3,196,860 
tetrahedral elements. 

A very general combination of forced motions is applied to the configuration as follows. The fuselage mesh 
is subjected to a rigid fixed-rate rotational and translational motion in the starboard direction. The motion 
of each rotor blade is treated as a child of the fuselage motion, and consists of an additional rigid fixed-rate 
rotation in the azimuthal direction. Each blade is also subjected to a final child motion consisting of a forced 
vertical flapping that is modeled as a 1° oscillatory rotation about the rotor hub with a two-per-revolution 
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frequency, and is accommodated with the deforming mesh mechanics. The background mesh is held fixed 
in inertial space. The overall motion of the configuration is shown in Fig. 2, while the vertical extent of 
the blade tip motion due to flapping is shown in Fig. 3. In summary, the composite motion is a family of 
four generations, occurring in the following ancestral order from oldest to youngest; inertial reference frame, 
fuselage motion, azimuthal blade motion, and flapping blade motion. 

For the verification of the compressible implementation, the free-stream Mach number is 0.1 and the 
Reynolds number is 4.2 million based on the blade tip speed and chord, and fully turbulent flow is assumed. 
A similarly scaled Reynolds number of 3.1 million is used for the incompressible verification. The angle 
of attack is 2°, and the advance ratio is 0.12. The physical time step corresponds to 1° of rotation in the 
azimuthal direction. All of the computations are performed using 128 processors. 

Sensitivity derivatives of the lift coefficient for the entire vehicle after five physical time steps are computed 
using the discrete adjoint and complex-variable approaches. Although the coarse spatial resolution and brief 
duration of the simulation are not sufficient to resolve the flow physics of the problem, they are adequate 
to evaluate the discrete consistency of the implementation. Table 2 shows the compressible flow sensitivity 
derivatives with respect to angle of attack, variables characterizing the rigid-body motions, and parameters 
describing the blade and fuselage shape. Results are shown for all of the temporal BDF schemes discussed 
in Section II and the appendix. Analogous results for the incompressible formulation are shown in Table 3. 
The results from the discrete adjoint and complex-variable approaches are in very good agreement for all 
cases; non-matching digits in the sensitivities are underlined. 

X. Large-Scale Test Cases 

To evaluate the proposed design methodology, aerodynamic optimizations are performed using three large- 
scale test cases. The goal is solely to demonstrate the ability of the implementation to successfully reduce 
each of the stated objective functions while satisfying any constraints present. While details pertaining to the 
underlying flow physics clearly may be of interest in each case, investigations of that nature are considered 
beyond the scope of the current effort and are not explored here. 

For each case shown below, the spatial and temporal grid resolutions have been chosen based on a suitable 
compromise between solution accuracy and computational efficiency. Each optimization is performed on an 
SGI ICE system using dual-socket hex-core nodes with Intel Xeon X5670 cores in a fully-dense configuration. 
A single additional node is allocated for serial execution of the dynamic hole-cutting library. The computa¬ 
tional environment also includes a Lustre-based parallel file system,®^ and computational statistics include 
any disk I/O time required to read or write the complete flow-field solution. 

As described above, the implementation supports very general motions including the use of deforming 
bodies. However, physical models typically responsible for such effects — such as structural models — 
generally are a strong function of the aerodynamics and require a formal coupling procedure. While the flow 
solver used in the current study can accommodate such models, the adjoint formulation does not account 
for such effects at this time. Therefore, to evaluate the current methodology, all large-scale simulations 
described here rely on forced motions. Development of a more general adjoint formulation required for 
coupling aerodynamics with other disciplinary models is relegated to future work. 

X.A. NREL Phase VI Wind Turbine 

The first test case is based on the NREL Phase VI wind turbine described in Ref. 61. The geometry is a 
two-blade upwind configuration with a nacelle and tower. The grid system used here has been developed in 
Ref. 43. The component grid for each blade consists of 4,510,177 nodes and 26,574,786 tetrahedral elements, 
and a separate component grid containing the nacelle and tower geometries consists of 971,059 nodes and 
5,716,227 tetrahedral elements. The background mesh consists of 4,776,082 nodes and 28,278,639 tetrahe¬ 
dral elements. The resulting composite mesh system contains 14,767,495 nodes and 87,144,438 tetrahedral 
elements. Views of the configuration and surface meshes are shown in Eig. 4. 

The simulation is fully turbulent and is performed using the incompressible form of the governing equa¬ 
tions. Standard sea-level conditions are used with a free-stream velocity of fifteen meters per second aligned 
with the axis of rotation. The radius of the blades is 5.029 meters and the system rotates at a speed of 
seventy-two RPM. The BDP2opt time integration scheme is used with 100 subiterations and a physical time 
step corresponding to 1° of blade rotation. Solutions are run for 720 time steps or two complete revolutions 
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of the blades. The torque profile for the baseline geometry is shown as the solid line in Fig. 5. After a series 
of initial transients, the solution quickly settles into a quasi-steady state behavior. The mean value of the 
torque coefficient Cq measured over the second revolution is 0.00130. An isosurface of the Q criterion®^ is 
included in Fig. 6. 

The goal of the current test case is to maximize the torque acting on the turbine by altering the blade 
geometry. The objective function is based on torque values Cg, which do not include the nondimensional- 
ization using the reference geometry, and is posed as a discrete summation of the intermediate torque value 
minus a constant target value over the second revolution; 


fob] 


Y, (^3 - 2.0)2At. 

n=361 


(44) 


The target value of 2.0 has been chosen based on the initial Cq profile. The objective function could also 
be formulated in terms of nondimensional torque values; in this case, the target value should be rescaled 
accordingly. There are a total of 76 design variables as shown in Fig. 7. These include seven twist values 
located at various stations along the span of the blade as well as twenty-one thickness and forty-eight camber 
variables distributed across the blade planform. Thinning of the blade is not allowed. 

The optimization is performed using 240 computational nodes or a total of 2,880 processing cores. In 
this environment, individual flow-field and adjoint solutions require 6.5 and 6 hours of wall-clock time, 
respectively. Approximately 950 gigabytes of disk space are required to store a complete flow-field solution 
and its associated domain connectivity data. The package described in Ref. 63 is used to perform the 
optimization. 

The convergence history for the optimization is shown in Fig. 8. The objective function has been reduced 
from its initial value of 69.4 to a final value of 58.7. The final profile for the torque coefficient is included as 
the dashed line in Fig. 5. The mean value Cq measured over the second revolution is 0.00159, an increase 
of 22% over the baseline value. Cross-sections of the baseline and final blade geometries are shown in Fig. 
9. The optimization has increased the thickness across much of the span, while also increasing the negative 
camber in the trailing edge region. 

The optimization procedure for the current test case required nine flow solutions and eight adjoint 
solutions, for a total of 307,000 CPU hours or 4.5 days of wall-clock time. Although not done for the 
wind turbine demonstration, practical constraints such as root-bending moment or thrust constraints are 
straightforward to incorporate as shown in Section X.C. 

X.B. Biologically-Inspired Flapping Wing 

The next test case is based on a simple wing configuration undergoing a complex kinematic motion inspired 
by insects such as the Hawkmoth manduca sextaf'^ Such concepts are receiving considerable attention in 
applications to micro air vehicles.®^ The geometry consists of a rectangular flat plate with semi-circular 
leading and trailing edges and an aspect ratio of 3.33. The mesh system used for this example has been 
generated using the approach outlined in Ref. 66. The component mesh containing the wing geometry 
consists of 3,016,149 nodes and 17,642,078 tetrahedral elements. The background mesh containing the plane 
of symmetry and outer boundaries consists of 5,339,195 nodes and 31,446,042 tetrahedral elements, yielding 
a composite mesh with 8,355,344 nodes and 49,088,120 tetrahedral elements. A nearfield view of the wing 
surface mesh is shown in Fig. 10. 

The baseline wing is offset 1.33 chord lengths from the plane of symmetry and is assumed to be operating 
in quiescent conditions. The imposed motion is achieved through the user-defined kinematics interface 
described above. Here, time-varying angles describing rotations about the x-, y-, and z-axes are specified in 
the following general form: 

0X = Ax[cOs{uiixt) — 1 ] + BxSm{uj2xt), 

9y = Ay[cos{u)iyt) -l]+ By sm{u}2yt), (45) 

9z = A2[cos(a;izf) — 1] -l- BzSm{L02zt), 

where the amplitudes and frequencies are specified by the user. These angles are used to construct a series 
of rotation matrices of the form given by Eq. 20. These matrices are then multiplied together to form the 
final rotation matrix used to specify the current wing position. 


15 of 37 


American Institute of Aeronautics and Astronautics 



In the current example, the baseline motion is a superposition of two oscillatory rotations, each occurring 
at 26 Hz. The first rotation is a sweeping motion that rotates the wing ±60° about its root chord line. The 
second rotation is a feathering motion that rotates the wing ±45° about its leading edge. The net effect of 
this composite motion is a thrust force in the direction from trailing edge to leading edge. Several snapshots 
of the wing undergoing a period of the baseline motion are shown in Fig. 11. 

The Reynolds number based on the wing chord and maximum tip speed is 1,280. The governing equations 
are the incompressible laminar Navier-Stokes equations. The BDF2opt time integration scheme is used with 
fifty subiterations and a physical time step corresponding to 250 steps per period of the baseline motion. 
Each simulation is run for 1,250 time steps and is performed using 160 computational nodes or a total of 
1,920 processing cores. Approximately 850 gigabytes of disk space are required to store a complete flow-field 
solution and its associated domain connectivity data. Individual flow-field and adjoint solutions require 
roughly four and three hours of wall-clock time, respectively. The baseline thrust profile exhibits a two-per- 
cycle periodic behavior as shown by the solid line in Fig. 12. The mean value of the thrust coefficient Ct 
measured over the final period is 0.127. 

The goal of the two test cases presented here is to maximize the thrust coefficient over the final 250 time 
steps by optimizing the fifteen design parameters describing the kinematic motion of the wing, namely the 
frequencies, amplitudes, and coordinates of the center of rotation for the composite motion described above. 
Both of the optimizations have been performed using the package described in Ref. 67. The first test case 
uses an objective function based on a target thrust distribution; 

1,250 

Uj= Y. (C? - (46) 

n=l,001 

The second test case uses an objective function which aims to match a single target value for the time- 
averaged value of thrust: 

^=[( 5 ^ £ A(. (47) 

In each case, the target value of 5.0 has been chosen based on the initial thrust profile shown in Fig. 
12. Although not shown, physical constraints such as power constraints can also be incorporated in a 
straightforward fashion. 

The convergence history for the objective function based on a target distribution is shown by the square 
symbols in Fig. 13. The value has been steadily reduced from 729 to 706 over ten design cycles. Inspection 
of the final values of the design variables shown in Table 4 reveals moderate changes to all parameters. 
The final thrust profile is included as the dashed line in Fig. 12. The optimization has not only increased 
the magnitude of the peaks, but has also altered the frequency content such that three peaks now occur 
within the interval used to define the objective function. The mean value of the thrust coefficient over the 
final 250 time steps is 0.207, a 63% increase over the baseline value. For this test, the optimizer requested 
twenty-two flow solutions and ten adjoint solutions, requiring approximately 227,000 CPU hours or five days 
of wall-clock time. 

The results based on the time-average objective function are included in Fig. 12 as the dash-dot line. As 
in the previous case, the frequency of the signal has been altered to yield three peaks within the objective 
function interval. The mean value of the thrust coefficient over the final 250 time steps has been increased to 
0.265, a 109% increase over the baseline value. The objective function history is plotted in Fig. 13, where it 
can be seen that the value has been reduced from 2.92 to 2.75 over eight design cycles. Here, the optimizer 
requested twenty-five flow solutions and eight adjoint solutions, requiring 238,000 CPU hours or just over 
five days of wall-clock time. 

It should be noted that a series of shape optimizations were also attempted for the current test problem, 
but are not presented here. A total of eighty-eight shape parameters describing the twist, shear, thickness, 
and camber of the wing were used. In general, any shape modification yielding a thrust improvement over 
one half of the period was seen to be equally detrimental to performance during the opposite half, as each 
wing surface alternates between pressure and suction conditions. Other forms of shape modification such as 
planform effects could prove beneficial, although such changes have not been explored here. 
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X.C. UH-60A Blackhawk Helicopter 


The final test case is based on the UH-60A Blackhawk helicopter configuration.®® Extensive analysis of 
this configuration has previously been performed using the solver employed in the current study.®® The 
composite grid system used here consists of four identical blade component grids and a single component 
grid containing the fuselage and outer extent of the computational domain. Each of the blade grids consists 
of 1,266,525 nodes and 7,476,818 tetrahedral elements, while the fuselage grid contains 4,196,841 nodes and 
24,735,227 tetrahedral elements. This results in a composite grid system consisting of 9,262,941 nodes and 
54,642,499 tetrahedral elements. The surface mesh for the configuration is shown in Eig. 14. 

The governing equations are the compressible Reynolds-averaged Navier-Stokes equations. The simula¬ 
tion is based on a forward flight condition with a blade tip Mach number equal to 0.6378 and a Reynolds 
number of 7.3 million based on the blade tip chord. The advance ratio is 0.37 and the angle of attack is 
0°. The rotor blades are subjected to a time-dependent pitching motion that is modeled as a child of the 
azimuthal rotation and is governed by a sinusoidal variation based on collective and cyclic control inputs; 

9 = 9c + 9icCostl) + 9isSinip. (48) 

Here, 9 is the current blade pitch setting, ^ is the current azimuth position for the blade, 9c represents the 
collective control input, and 9ic and 9is are the lateral and longitudinal cyclic control inputs, respectively. 
All three control inputs are set to 0° at the baseline condition; i.e., the vehicle is initially untrimmed. 

The BDE2opt time integration scheme is used with fifteen subiterations and a physical time step corre¬ 
sponding to 1° of rotor rotation. The simulation is run for two rotor revolutions using 160 computational 
nodes or a total of 1,920 processing cores. In this environment, a single execution of the flow and adjoint 
solvers requires two and three hours of wall-clock time, respectively. Approximately 650 gigabytes of disk 
space are required to store a complete flow-field solution and its associated domain connectivity data. 

Eigure 15 shows an isosurface of the Q criterion®^ after two rotor revolutions. The vortices emanating 
from each blade tip and other surfaces of the vehicle are clearly visible. Profiles of the baseline lift and lateral 
and longitudinal moment cofficients are shown as the solid lines in Eigs. 16-18. The values quickly establish 
a four-per-revolution periodic behavior after 180° of blade rotation. The mean value of the lift coefficient 
over the second rotor revolution is 0.023. The untrimmed flight condition is clearly evident in the nonzero 
mean values for the two moment coefficients. 

The objective for the current test case is to maximize the lift acting on the vehicle while satisfying explicit 
constraints on the lateral and longitudinal moments such that the final result is a trimmed flight condition. 
The design variables consist of 64 shape parameters describing the rotor blades, including an 8x4 matrix 
of 32 thickness variables and 32 camber variables as shown in Eig. 19. While the camber is allowed to 
increase or decrease, no thinning of the blade is allowed. In addition, Eq. 48 and its relationship to the blade 
pitch transform matrix are also linearized, allowing the control variables 9c, 9\c, and 9\s to also be used as 
design variables. These control angles are allowed to vary as much as ±7°. Note that parameters describing 
geometric changes to the fuselage could also be applied; however, without guidance for practical constraints 
on such changes, such variables are not used here. 

The objective function to be minimized is based on the time-averaged value of the lift coefficient over the 
second rotor revolution; 

The target value of 2.0 has been chosen based on the initial lift profile. The explicit constraints on the two 
moment coefficients are also based on time-averaged values over the same interval; 

i E (50) 

^ n^361 
^ 720 

»= 360 

The constraints are considered satisfied if g\ = gi = 0, within a feasibility tolerance of ±0.0001. The 
optimization is performed using the package described in Ref. 63. Note that the treatment of the moment 
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constraints requires two additional adjoint solutions to compute the associated gradient vectors. These 
additional solutions are obtained simultaneously with the adjoint computation for the lift objective using 
the procedure outlined in Ref. 24 to accommodate multiple right-hand side vectors in Eqs. 34-36. 


X.C.l. Design Results 

Figure 20 shows the convergence of the objective function and constraints after three design cycles. The 
optimization procedure quickly locates a feasible region in the design space based on the two moment 
constraints and the value of the objective function is successfully reduced. The final unsteady lift profile is 
included as the dashed line in Fig. 16. The mean value has been substantially increased to a value of 0.103. 
The final unsteady profiles for the lateral and longitudinal moment coefficients are included as the dashed 
lines in Figs. 17 and 18, respectively. Fach of the new profiles has the desired zero mean value, indicating 
that the final design is trimmed for level flight within the requested tolerance. 

Based on the spanwise blade stations noted in Fig. 19, cross-sections of the initial and final blade 
geometries are shown in Fig. 21. The shape changes are confined to the aft sections of the outer portion 
of the blade, where the camber has been increased. The final value of the collective input 9c is 6.71°, while 
the final values for the cyclic inputs 9ic and 9is are 2.58° and -7.00°, respectively. The entire optimization 
procedure requiring four flow solutions and four adjoint solutions took approximately 20 hours of wall-clock 
time, or 38,400 CPU hours. 

X.C.2. Interpretation of the Adjoint Solution 

Typical qualitative features of unsteady adjoint solutions are shown in Fig. 22 for the objective function 
given by Eq. 49. The figure depicts centerline contours of the adjoint solution for the energy equation at 
time level n = 420. The contours represent the instantaneous sensitivity of the objective function to a source 
term applied to the energy equation at each point in the domain. Similar to steady-flow objective functions 
based on surface integrals,®®^^^ the time-averaged value of the lift is particularly sensitive to information 
propagating along the stagnation streamline and impacting the nose of the fuselage. In addition. Fig. 22 
highlights several features emanating from the rotor blades as they pass through the cutting plane. These 
features are loosely analogous to unsteady flow phenomena such as vortex sheets and tip vortices commonly 
seen in forward solutions for rotor flows as shown in Fig. 15. However, unlike the forward problem, the 
features shown in the adjoint solution propagate in the upstream direction as the adjoint system is integrated 
in reverse physical time, indicating the sensitivity of the objective function to disturbances upstream. 

In design optimization, the adjoint solutions are combined with the linearizations of the residual operators 
with respect to design variables to yield sensitivity derivatives. Alternatively, the adjoint solutions may be 
combined with local residuals to provide rigorous error estimates or with (local estimates of) the truncation 
errors to guide mesh adaptation. Although these applications are not the focus of the current work, adjoint- 
based adaptation methodologies^"^ offer many compelling advantages over traditional feature-based mesh 
adaptation techniques which fail to identify important regions such as those containing the upstream features 
highlighted in Fig. 22. 


XI. Summary and Future Work 

A general verified methodology for adjoint-based design optimization of unsteady turbulent flows on dy¬ 
namic overset unstructured mesh systems has been presented. The formulation is valid for compressible and 
incompressible forms of the Reynolds-averaged Navier-Stokes equations. The implementation is amenable 
to massively parallel computing environments and has been verified through the use of an independent tech¬ 
nique based on a complex-variable formulation. Several large-scale optimizations have been demonstrated 
for complex flowfields involving a wind turbine configuration, a flapping wing, and a realistic helicopter 
geometry subject to trimming constraints. The objective functions have been successfully reduced in each 
case and all constraints present have been satisfied. 

Although the demonstrated methodology provides a practical approach to optimization of general un¬ 
steady aerodynamic flows, a wide range of research topics remains to be explored. Locally optimal, 
reduced-order model,^"^ and checkpointing^® techniques offer the potential to greatly reduce storage require¬ 
ments. Multi-fidelity optimization algorithms^® should be exploited where possible to reduce dependence 
on high-fidelity simulations. Convergence acceleration techniques^® can clearly have a direct impact on 
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computational cost. Simultaneous adjoint-based error estimation and mesh adaptation approaches'*^ are 
very attractive in establishing rigorous gridding requirements and eliminating user interaction. Extension of 
adjoint-based methods to multidisciplinary optimization beyond the scope of computational fluid dynamics 
is essential for making significant impacts on the current paradigm for design of aerospace vehicles and other 
areas of applications. Finally, advancements in the fields of computer science, software development, and 
high-performance computing must continue to be leveraged to the greatest extent possible. 
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Appendix: Adjoint Eqnations for Higher-Order BDF Schemes 

Discrete conservation laws employing high order temporal BDF schemes as introduced in Eq. 6 are 
defined as 

o; o , v; + 

+R" + ((I?Q"-i) o C" + /3C-) o nicL = 0- 

Proceeding as before, the Lagrangian can be written as 

L (D, Q, X, A, Ag) = f At+J2 [A”] ^ G"Af 

+ E {[C? o o V?-ho (I-V"-2) 

^ (53) 

+ [A?]^ [R™ + ((I^Q"-i) o + /3C?) o Rg^^] + [a^]^ [A"Q-] + [A^f [P-Q"]| At 
+ (/° + [A°] ^ GO + [AO] ^ R-) At. 

On time levels 1 and 2, the time derivatives are assumed to be discretized with the BDFl and BDF2 schemes, 
respectively. Taking into account the dependencies on data at time levels n—2 and n—3, the adjoint equations 
are obtained as follows; 


19 of 37 


American Institute of Aeronautics and Astronautics 



5 : o A? + [fi;] ^ + [A^]^ A] + [P-f A- = 

- [^] ^ - 1^ { [1^1] ^ [(-^ v«+i - + Pr+c\) o o A^ 1] 

+ [(siI”+^V") oC^+2oA^+2] + [1^+3]^ [(^I”+3V") oC^+3oAJ+3]} , 

f:[if]^A;+[Aj]^A;+[pyf a; = 

- [f%r - 1 ? { p?”] "^ [(- sV?” - sir'V"-! - ^ 1 ;+'V"-" + R“« ) o c;+‘ o a;*'] 

+ [ 1 ^+ 2 ]^ [(^ir^V") O C ^+2 O A^+ 2 ] + [ 1 ^+ 3 ]^ [(sir^V") O C ^+3 O A”+ 3 ] } , 

h-.[^Yk + iKf a 7 + [p;:]^ Ai = 

- [^] ^ - 15 : {[!?+']^ [(-sV”+' - ^I?+'V"-3 - ^I^+3V-2 + ni+c\) o c^3 o a?+3] 

+ [1^+2]'^ [(^I^+2V”) oC^+2oA^+2] + [1^+3]^ [(^I”+3V") oC^+3oAJ+3]} , 


for3<n<N-, 

S : sit V? o 0 a: + [§§;] ^ A? + [A^]^ A] + [P^f Al = 

- [^] ^ - I" { ^ [(-it - it- ^iriV-2 + Rg+i^) o C^3 o A^i] 

+ [^s^'^r [(stI?+^V") oC^+2oA"+ 2] + [l”+3]^ [(^I^+3V") oC™+3oAJ+3]} , 


■■m\ f+[^i\ ^i+[P7\ K= 

- [t%] ^ - 1/ { [I^']^ [(-itV?+' - + Rg+l^) O o A?+3] 

+ [l”+2]'^ [(^I^+^V") oC^+2oA"+ 2] + [l™+3]^ [(^I^+3V") oC™+3oA”+3]} , 


A” + [A^]^Ay+[P^]'^A5J- 

-[MiT-'^h{ [!?+']^ [(-it Vr' - - ^I^+iV -2 + RS+\,) o C ^3 o A^+i] 

+ [(itI"+^V") oC^+2oA™+2] + [1™+^]^ [(^I^+3V") oC™+3oA”+3]} , 


for n = 2; 
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s: iv; o c; o a; + [f§] ’■ a; + |a;|’' aj + [p;]’" a; = 

- -1; {[(-ivr' - jijir'v-i+ rj« ) ocr>» a;+i] 

+ [(stI”+^V”) oC^+2oA”+ 2] + [l™+3]^ [(^I^+3V") oC^+3oA”+3]} , 

f:[iS;]^A:+[Aj]^A;+[pj]^A|; = 

- -1/ { R+'l’’ [(-±v;+> - 5 fci;+'v"-' + RJ« ) o c;+‘ o a;+'] 

+ [1;+'^]’’ [(^i;+'‘v") oc;+“c. a;+^] + [i;+“]’' [(^i;+’v") o c;+“ o a;+>] }, 

A" + [a;^]^A^ + [p;^]^Aij = 

- [^] ’' -'»{[(-±vr' - jiiir'v-i+Rj«) o cr* •> a;+i] 

+ [1;+“]’’ a;+^] + [!;+“]’' [(^i;+’v") o c;+“ o a;+>] }, 


[ w] - [so"] ^ - {wr [(* + ''■oci) o c; o a;] 

+ [«]’■ [(ifei?v») o c» o a 5] + [ig^ [(iii>v») o Cf o Af]}, 


for n = 0. 
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The corresponding grid adjoint equations are obtained as follows. Assuming = A''^+^ = A^+^ = 0: 



+ If^ 0 o A3) + [1^]" AO 

+ t [fl^ + ((I"Q'=“') ° CJ + pc’!) © 5^] ^A>!+[^Y, for n = 0. 


(57) 

The sensitivity derivative for the higher-order BDF schemes is evaluated using Eq. 36. 
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Table 1. 

Coefficient 

;s for BDF scher 

nes. 

Scheme 

a 

b 

c 

d 

BDFl 

1 

-1 

0 

0 

BDF2 

3/2 

-2 

1/2 

0 

BDF3 

11/6 

-3 

3/2 

-1/3 

BDF2opt 

1.66 - 

-2.48 

0.98 

-0.16 


Table 2. Values of the sensitivity derivative for different design variables and temporal discretizations for 

compressible flow. The symbols A and C denote adjoint and complex-variable results, respectively. Discrepancies are 
shown in bold and underlined. 


Variable 


BDFl 

BDF2 

BDF2opt 


BDF3 

Angle of 

A: 

0.116458961683733 

A: 0.102099965021956 

A: 0.102915752531413 

A: 

0.103785048456802 

Attack 

C: 

0.116458961683734 

C: 0.102099965021956 

C: 0.102915752531413 

C: 

0.103785048456802 

Rot Rate 

A: 

0.619149219921508 

A: 0.609270815829788 

A: 0.592456231940897 

A: 

0.575091540944799 

Blade 1 

C: 

0.619149219933539 

C: 0.609270815842755 

C: 0.592456231953869 

C: 

0.575091540957581 

Shape 

A: 

0.056440771725301 

A: 0.064382783171893 

A: 0.062734653842921 

A: 

0.060943525618014 

Blade 2 

C: 

0.056440771725196 

C: 0.064382783171802 

C: 0.062734653842842 

C: 

0.060943525617920 

Flap Freq 

A: 

-0.414712919056299 

A: -0.337250987004676 

A: -0.3445555132674M 

A: 

-0.352419586848976 

Blade 3 

C: 

-0.414712919056270 

C: -0.337250987004642 

C: -0.344555513267474 

C: 

-0.352419586848961 

Rot Rate 

A: 

6.86680217888885 

A: 7.42798143738984 

A: 7.31688305983601 

A: 

7.20812218587293 

Fuselage 

C: 

6.86680217888239 

C: 7.42798143738254 

C: 7.31688305982953 

C: 

7.20812218586623 

Trans Rate 

A: 

0.420300051382122 

A: 0.400837175635065 

A: 0.390973864106570 

A: 

0.379952931745697 

Fuselage 

C: 

0.420300051369376 

C: 0.400837175622066 

C: 0.390973864093789 

C: 

0.379952931733500 

Shape 

A: 

-0.007809447236753 

A: -0.009590444345683 

A: -0.009613538492229 

A: 

-0.009705401931920 

Fuselage 

C: 

-0.007809447236691 

C: -0.009590444345727 

C: -0.009613538492351 

C: 

-0.009705401931704 
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Table 3. Valu 
incompressible 
shown in bold 

Bs of the sensitivity deriv 
flow. The symbols A and 
and underlined. 

ative OCts/SD for differe 

nt design variables and ten 
plex-variable results, respe 

rpor 

al discretizations for 

Variable 

BDFl 

BDF2 

BDF2opt 


BDF3 

Angle of 

A: 0.000195945789030 

A: 0.000234143173131 

A: 0.000218182269639 

A: 

0.000191641169710 

Attack 

C: 0.000195945789030 

C: 0.000234143173131 

C: 0.000218182269639 

C: 

0.000191641169711 

Rot Rate 

A: 0.0095180739768M 

A: 0.010325090376673 

A: 0.010544987182945 

A: 

0.010757597020150 

Blade 1 

C: 0.009518073976838 

C: 0.010325090376647 

C: 0.0105449871829^ 

C: 

0.0107575970201M 

Shape 

A: 0.000535025241509 

A: 0.000607314158464 

A: 0.000618811948355 

A: 

0.000633736751875 

Blade 2 

C: 0.000535025241508 

C: 0.000607314158463 

C: 0.000618811948355 

C: 

0.000633736751875 

Flap Freq 

A: -0.004866399384562 

A: -0.004825188859067 

A: -0.004821787992149 

A: 

-0.004810632891273 

Blade 3 

C: -0.004866399384562 

C: -0.004825188859067 

C: -0.004821787992149 

C: 

-0.004810632891273 

Rot Rate 

A: 0.042649260159755 

A: 0.044962632318017 

A: 0.044947751807594 

A: 

0.044876653248215 

Fuselage 

C: 0.042649260159807 

C: 0.044962632318090 

C: 0.044947751807680 

C: 

0.044876653248312 

Trans Rate 

A: 0.010034159304733 

A: 0.010404514410124 

A: 0.010284602229241 

A: 

0.010043806857134 

Fuselage 

C: 0.010034159304771 

C: 0.010404514410192 

C: 0.010284602229293 

C: 

0.010043806857193 

Shape 

A: 0.000087061995334 

A: 0.000079589134812 

A: 0.000082271937020 

A: 

0.000086753178814 

Fuselage 

C: 0.000087061995336 

C: 0.000079589134815 

C: 0.000082271937019 

C: 

0.000086753178823 


Table 4. Values of the initial and final design variables for the flapping wing configuration. The abbreviation COR 
denotes the center of rotation. 


Variable 

Baseline 

Distribution Target Function 

Time-Average Target Function 

x-COR 

0.000 

0.025c 

0.027c 

y-COR 

0.000 

-0.119c 

-0.114c 

z-COR 

0.000 

0.011c 

0.012c 

Aa: 

0.00 

0.77 

-0.11 


45.00 

45.13 

45.25 


163.36 

163.45 

163.36 

^2x 

163.36 

177.47 

192.77 

Ay 

0.000 

0.30 

-0.99 

By 

0.000 

-1.50 

-0.26 

WXy 

163.36 

162.76 

163.15 

UJ2y 

163.36 

163.10 

162.97 

A, 

-60.00 

-62.71 

-62.83 


0.00 

0.69 

-1.55 


163.36 

173.59 

189.57 

^2z 

163.36 

164.41 

163.55 
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Figure 10. Surface mesh for flapping wing case. 




(a) First half of period. 


(b) Second half of period. 


Figure 11. Snapshots of haseline flapping wing motion. 
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